Novel bioinformatics strategies for prediction of directional sequence changes in influenza virus genomes and for surveillance of potentially hazardous strains
Abstract
BACKGROUND: With the remarkable increase in microbial and viral sequence data obtained from high-throughput DNA sequencers, novel tools are needed for comprehensive analysis of big sequence data. We have developed the "Batch-Learning Self-Organizing Map (BLSOM)", which can characterize very large numbers of genomic sequences, even millions, on a single plane. Influenza virus is a zoonotic virus and shows clear host tropism. Important issues for bioinformatics studies of influenza viruses are the prediction of genomic sequence changes in the near future and the surveillance of potentially hazardous strains.

METHODS: To characterize sequence changes in influenza virus genomes after invasion into humans from other animal hosts, we applied BLSOMs to analyses of mono-, di-, tri-, and tetranucleotide compositions in all genome sequences of influenza A and B viruses and found clear host-dependent clustering (self-organization) of the sequences.

RESULTS: Viruses isolated from humans and those isolated from birds differed in mononucleotide composition. In addition, oligonucleotide BLSOMs revealed host-dependent oligonucleotide compositions that could not be explained by the host-dependent mononucleotide composition. Retrospective time-dependent directional changes in mono- and oligonucleotide compositions, which were visualized for human strains on BLSOMs, could provide predictive information about sequence changes in viruses that have newly invaded from other animal hosts (e.g. the swine-derived pandemic H1N1/09).

CONCLUSIONS: Based on the host-dependent oligonucleotide composition, we proposed a strategy for predicting directional changes of virus sequences and for surveillance of potentially hazardous strains when they are introduced into human populations from non-human sources. Millions of genomic sequences from infectious microbes and viruses have become available because of their medical and social importance, and BLSOM can characterize these big data and support efficient knowledge discovery.
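The mono- to tetranucleotide compositions that serve as BLSOM input can be illustrated with a short sketch. This is not the authors' code: the function name `kmer_composition`, the conversion of U to T, and the choice to skip windows containing ambiguous bases are all assumptions made for illustration. It only shows one plausible way to turn a sequence into a frequency vector with 4^k components.

```python
from collections import Counter
from itertools import product


def kmer_composition(seq, k):
    """Relative frequency of each of the 4**k k-mers in seq (hypothetical helper).

    Assumptions: sequences use the letters A, C, G, T (U is mapped to T),
    and windows containing any other symbol (e.g. N) are ignored.
    """
    seq = seq.upper().replace("U", "T")
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    # Count every overlapping window of length k.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Normalize only over unambiguous k-mers.
    total = sum(counts[m] for m in kmers)
    if total == 0:
        return {m: 0.0 for m in kmers}
    return {m: counts[m] / total for m in kmers}


if __name__ == "__main__":
    example = "AACGUUAGCA"  # toy sequence, not real influenza data
    for k in (1, 2, 3, 4):
        vector = kmer_composition(example, k)
        print(k, len(vector))
```

Each genome segment would yield one such vector per value of k (4, 16, 64, or 256 components), and vectors of this kind are what a self-organizing map clusters.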
Similar resources
Bioinformatics study of complete amino acid sequences of neuraminidase (NA) antigen of H1N1 influenza viruses from 2006 to 2013 in Iran
Introduction: Influenza is a contagious acute viral disease of the respiratory tract that causes fever, headache, muscle aches and cough. One of the unique features of influenza virus is antigenic variation in viral protein neuraminidase (NA) which causes emergence of new virus variants. NA is responsible for the release and spread of progeny virions. Due to the continuous changes of NA genes, ...
Prediction of Directional Changes of Influenza A Virus Genome Sequences with Emphasis on Pandemic H1N1/09 as a Model Case
Influenza virus poses a significant threat to public health, as exemplified by the recent introduction of the new pandemic strain H1N1/09 into human populations. Pandemics have been initiated by the occurrence of novel changes in animal sources that eventually adapt to humans. One important issue in studies of viral genomes, particularly those of influenza virus, is to predict possible changes i...
Designing of A Multi-epitope Recombinant Protein, Consisting of Several Conserved Epitopes from Hemagglutinin Protein of the H1N1 and H5N1 Strains of Influenza Virus by Immunoinformatics Approaches
Introduction: According to marked advances in bioinformatics studies, development of influenza vaccines has been greatly modified in many studies. In this study, we have designed a multi-epitope recombinant protein, consisting of several conserved epitopes from Hemagglutinin protein of the H1N1 and H5N1 strains of Influenza virus by immunoinformatics approaches. Materials and Methods: The regis...
A Novel Bioinformatics Strategy to Analyze Microbial Big Sequence Data for Efficient Knowledge Discovery: Batch-Learning Self-Organizing Map (BLSOM)
With the remarkable increase of genomic sequence data of microorganisms, novel tools are needed for comprehensive analyses of the big sequence data available. The self-organizing map (SOM) is an effective tool for clustering and visualizing high-dimensional data, such as oligonucleotide composition on one map. By modifying the conventional SOM, we developed batch-learning SOM (BLSOM), which all...
Cloning, expression and purification of hemagglutinin conserved domain (HA2) of influenza A virus, to be used in broad-spectrum subunit vaccine cocktails
Introduction: Influenza virus has several conserved peptides which have the capacity to be used as suitable candidates for appropriate and stable vaccine production against different types of influenza viruses. One of these peptides is HA2, the hemagglutinin stalk domain which mediates the membrane fusion and is conserved amongst different sub-types of influenza virus. This peptide is a good ca...